3 <110> APPLICANT: ADNEY, WILLIAM S
4       DING, SHI-YOU S
5       VINZANT, TODD B.
6       DECKER, STEPHEN R.
7       HIMMEL, MICHAEL E.
9 <120> TITLE OF INVENTION: THERMAL TOLERANT EXOGLUCANASE FROM ACIDOTHERMUS
      CELLULOLYTICUS
11 <130> FILE REFERENCE: NREL 31-38
13 <140> CURRENT APPLICATION NUMBER: US 09/917,384B
14 <141> CURRENT FILING DATE: 2001-07-28
16 <160> NUMBER OF SEQ ID NOS: 11
18 <170> SOFTWARE: PatentIn version 3.2

20 <210> SEQ ID NO: 1
21 <211> LENGTH: 1121
22 <212> TYPE: PRT
23 <213> ORGANISM: Acidothermus cellulolyticus
26 <220> FEATURE:
27 <221> NAME/KEY: misc_feature
28 <223> OTHER INFORMATION: Full-length sequence of Gux1 protein
30 <400> SEQUENCE: 1
32 Met Pro Gly Leu Arg Arg Arg Leu Arg Ala Gly Ile Val Ser Ala Ala
33 1 5 10 15

36 Ala Leu Gly Ser Leu Val Ser Gly Leu Val Ala Val Ala Pro Val Ala
37 20 25 30

40 His Ala Ala Val Thr Leu Lys Ala Gln Tyr Lys Asn Asn Asp Ser Ala
41 35 40 45

44 Pro Ser Asp Asn Gln Ile Lys Pro Gly Leu Gln Leu Val Asn Thr Gly
45 50 55 60

48 Ser Ser Ser Val Asp Leu Ser Thr Val Thr Val Arg Tyr Trp Phe Thr
49 65 70 75 80

52 Arg Asp Gly Gly Ser Ser Thr Leu Val Tyr Asn Cys Asp Trp Ala Ala
53 85 90 95

56 Met Gly Cys Gly Asn Ile Arg Ala Ser Phe Gly Ser Val Asn Pro Ala
57 100 105 110

60 Thr Pro Thr Ala Asp Thr Tyr Leu Gln Leu Ser Phe Thr Gly Gly Thr
61 115 120 125

64 Leu Ala Ala Gly Gly Ser Thr Gly Glu Ile Gln Asn Arg Val Asn Lys
65 130 135 140

68 Ser Asp Trp Ser Asn Phe Asp Glu Thr Asn Asp Tyr Ser Tyr Gly Thr
69 145 150 155 160

72 Asn Thr Thr Phe Gln Asp Trp Thr Lys Val Thr Val Tyr Val Asn Gly
73 165 170 175

76 Val Leu Val Trp Gly Thr Glu Pro Ser Gly Ala Thr Ala Ser Pro Ser
77 180 185 190

80 Ala Ser Ala Thr Pro Ser Pro Ser Ser Ser Pro Thr Thr Ser Pro Ser
81 195 200 205

84 Ser Ser Pro Ser Pro Ser Ser Ser Pro Thr Pro Thr Pro Ser Ser Ser
85 210 215 220

88 Ser Pro Pro Pro Ser Ser Asn Asp Pro Tyr Ile Gln Arg Phe Leu Thr
89 225 230 235 240

92 Met Tyr Asn Lys Ile His Asp Pro Ala Asn Gly Tyr Phe Ser Pro Gln
93 245 250 255

96 Gly Ile Pro Tyr His Ser Val Glu Thr Leu Ile Val Glu Ala Pro Asp
97 260 265 270

100 Tyr Gly His Glu Thr Thr Ser Glu Ala Tyr Ser Phe Trp Leu Trp Leu
101 275 280 285

104 Glu Ala Thr Tyr Gly Ala Val Thr Gly Asn Trp Thr Pro Phe Asn Asn
105 290 295 300

108 Ala Trp Thr Thr Met Glu Thr Tyr Met Ile Pro Gln His Ala Asp Gln
109 305 310 315 320

112 Pro Asn Asn Ala Ser Tyr Asn Pro Asn Ser Pro Ala Ser Tyr Ala Pro
113 325 330 335

116 Glu Glu Pro Leu Pro Ser Met Tyr Pro Val Ala Ile Asp Ser Ser Val
117 340 345 350

120 Pro Val Gly His Asp Pro Leu Ala Ala Glu Leu Gln Ser Thr Tyr Gly
121 355 360 365

124 Thr Pro Asp Ile Tyr Gly Met His Trp Leu Ala Asp Val Asp Asn Ile
125 370 375 380

128 Tyr Gly Tyr Gly Asp Ser Pro Gly Gly Gly Cys Glu Leu Gly Pro Ser
129 385 390 395 400

132 Ala Lys Gly Val Ser Tyr Ile Asn Thr Phe Gln Arg Gly Ser Gln Glu
133 405 410 415

136 Ser Val Trp Glu Thr Val Thr Gln Pro Thr Cys Asp Asn Gly Lys Tyr
137 420 425 430

140 Gly Gly Ala His Gly Tyr Val Asp Leu Phe Ile Gln Gly Ser Thr Pro
141 435 440 445

144 Pro Gln Trp Lys Tyr Thr Asp Ala Pro Asp Ala Asp Ala Arg Ala Val
145 450 455 460

148 Gln Ala Ala Tyr Trp Ala Tyr Thr Trp Ala Ser Ala Gln Gly Lys Ala
149 465 470 475 480

152 Ser Ala Ile Ala Pro Thr Ile Ala Lys Ala Ser Gln Thr Gly Asp Tyr
153 485 490 495

156 Leu Arg Tyr Ser Leu Phe Asp Lys Tyr Phe Lys Gln Val Gly Asn Cys
157 500 505 510

160 Tyr Pro Ala Ser Ser Cys Pro Gly Ala Thr Gly Arg Gln Ser Glu Thr
161 515 520 525

164 Tyr Leu Ile Gly Trp Tyr Tyr Ala Trp Gly Gly Ser Ser Gln Gly Trp
165 530 535 540

168 Ala Trp Arg Ile Gly Asp Gly Ala Ala His Phe Gly Tyr Gln Asn Pro
169 545 550 555 560

172 Leu Ala Ala Trp Ala Met Ser Asn Val Thr Pro Leu Ile Pro Leu Ser
173 565 570 575

176 Pro Thr Ala Lys Ser Asp Trp Ala Ala Ser Leu Gln Arg Gln Leu Glu
177 580 585 590

RAW SEQUENCE LISTING DATE: 09/09/2005
Output Set: N:\CRF4\09092005\l917384B.raw






180 Phe Tyr Gln Trp Leu Gln Ser Ala Glu Gly Ala Ile Ala Gly Gly Ala
181 595 600 605

184 Thr Asn Ser Trp Asn Gly Asn Tyr Gly Thr Pro Pro Ala Gly Asp Ser
185 610 615 620

188 Thr Phe Tyr Gly Met Ala Tyr Asp Trp Glu Pro Val Tyr His Asp Pro
189 625 630 635 640

192 Pro Ser Asn Asn Trp Phe Gly Phe Gln Ala Trp Ser Met Glu Arg Val
193 645 650 655

196 Ala Glu Tyr Tyr Tyr Val Thr Gly Asp Pro Lys Ala Lys Ala Leu Leu
197 660 665 670

200 Asp Lys Trp Val Ala Trp Val Lys Pro Asn Val Thr Thr Gly Ala Ser
201 675 680 685

204 Trp Ser Ile Pro Ser Asn Leu Ser Trp Ser Gly Gln Pro Asp Thr Trp
205 690 695 700

208 Asn Pro Ser Asn Pro Gly Thr Asn Ala Asn Leu His Val Thr Ile Thr
209 705 710 715 720

212 Ser Ser Gly Gln Asp Val Gly Val Ala Ala Ala Leu Ala Lys Thr Leu
213 725 730 735

216 Glu Tyr Tyr Ala Ala Lys Ser Gly Asp Thr Ala Ser Arg Asp Leu Ala
217 740 745 750

220 Lys Gly Leu Leu Asp Ser Met Trp Asn Asn Asp Gln Asp Ser Leu Gly
221 755 760 765

224 Val Ser Thr Pro Glu Thr Arg Thr Asp Tyr Ser Arg Phe Thr Gln Val
225 770 775 780

228 Tyr Asp Pro Thr Thr Gly Asp Gly Leu Tyr Ile Pro Ser Gly Trp Thr
229 785 790 795 800

232 Gly Thr Met Pro Asn Gly Asp Gln Ile Lys Pro Gly Ala Thr Phe Leu
233 805 810 815

236 Ser Ile Arg Ser Trp Tyr Thr Lys Asp Pro Gln Trp Ser Lys Val Gln
237 820 825 830

240 Ala Tyr Leu Asn Gly Gly Pro Ala Pro Thr Phe Asn Tyr His Arg Phe
241 835 840 845

244 Trp Ala Glu Ser Asp Phe Ala Met Ala Asn Ala Asp Phe Gly Met Leu
245 850 855 860

248 Phe Pro Ser Gly Ser Pro Ser Pro Thr Pro Ser Pro Thr Pro Thr Ser
249 865 870 875 880

252 Ser Pro Ser Pro Thr Pro Ser Ser Ser Pro Thr Pro Ser Pro Ser Pro
253 885 890 895

256 Ser Pro Thr Gly Asp Thr Thr Pro Pro Ser Val Pro Thr Gly Leu Gln
257 900 905 910

260 Val Thr Gly Thr Thr Thr Ser Ser Val Ser Leu Ser Trp Thr Ala Ser
261 915 920 925

264 Thr Asp Asn Val Gly Val Ala His Tyr Asn Val Tyr Arg Asn Gly Thr
265 930 935 940

268 Leu Val Gly Gln Pro Thr Ala Thr Ser Phe Thr Asp Thr Gly Leu Ala
269 945 950 955 960

272 Ala Gly Thr Ser Tyr Thr Tyr Thr Val Ala Ala Val Asp Ala Ala Gly
273 965 970 975

RAW SEQUENCE LISTING DATE: 09/09/2005
PATENT APPLICATION: US/09/917,384B TIME: 09:01:38

Input Set : A:\Sequence listing.txt
Output Set: N:\CRF4\09092005\l917384B.raw

276 Asn Thr Ser Ala Gln Ser Phe Ala Gly Asp Ser Asp Asp Gly Ile Ala
277 980 985 990

280 Val Ala Ser Pro Ser Pro Ser Pro Thr Pro Thr Ser Ser Pro Ser Pro
281 995 1000 1005

284 Thr Pro Ser Pro Thr Pro Ser Pro Thr Ser Thr Ser Gly Ala Ser
285 1010 1015 1020

288 Cys Thr Ala Thr Tyr Val Val Asn Ser Asp Trp Gly Ser Gly Phe
289 1025 1030 1035

292 Thr Thr Thr Val Thr Val Thr Asn Thr Gly Thr Arg Ala Thr Ser
293 1040 1045 1050

296 Gly Trp Thr Val Thr Trp Ser Phe Ala Gly Asn Gln Thr Val Thr
297 1055 1060 1065

300 Asn Tyr Trp Asn Thr Ala Leu Thr Gln Ser Gly Lys Ser Val Thr
301 1070 1075 1080

304 Ala Lys Asn Leu Ser Tyr Asn Asn Val Ile Gln Pro Gly Gln Ser
305 1085 1090 1095

308 Thr Thr Phe Gly Phe Asn Gly Ser Tyr Ser Gly Thr Asn Thr Ala
309 1100 1105 1110

312 Pro Thr Leu Ser Cys Thr Ala Ser
313 1115 1120

316 <210> SEQ ID NO: 2
317 <211> LENGTH: 3365
318 <212> TYPE: DNA
319 <213> ORGANISM: Acidothermus cellulolyticus
322 <220> FEATURE:
323 <221> NAME/KEY: misc_feature
324 <223> OTHER INFORMATION: Gux1 full-length coding sequence
326 <400> SEQUENCE: 2
327 atgccaggat tacgacggcg actccgcgcc ggtatcgtct cggcggcggc gttggggtcg 60
329 ctggttagcg ggctcgttgc cgtcgcacca gtcgcgcacg cggcggtgac tctcaaagcg 120
331 cagtataaga acaatgattc ggcgccgagt gacaaccaga tcaaaccggg tctccagttg 180
333 gtgaataccg ggtcgtcgtc ggtggatttg tcgacggtga cggtgcggta ctggttcacc 240
335 cgggatggtg ggtcgtcgac actggtgtac aactgtgact gggcggcgat ggggtgtggg 300
337 aatatccgcg cctcgttcgg ctcggtgaac ccggcgacgc cgacggcgga cacctacctg 360
339 cagttgtcgt tcactggtgg aacgttggcc gctggtgggt cgacgggtga gattcaaaac 420
341 cgggtgaata agagtgactg gtcgaacttt gatgagacca atgactactc gtatgggacg 480
343 aacaccacct tccaggactg gacgaaggtg acggtgtacg tcaacggcgt gttggtctgg 540
345 gggaccgaac cgtccggagc gacggcgtct ccatccgcgt cggcgacgcc cagcccgtcc 600
347 agttcaccga ccacgagtcc gagttcgtcc ccgtcgccga gcagcagccc gacgccgaca 660
349 ccgagcagct cgtcgccgcc ccgtcgtcca acgacccgta catccagcgg ttcctcacga 720
351 tgtacaacaa gattcacgac ccagcgaacg gctacttcag cccgcaggga attccctacc 780
353 actcggtaga aacgctcatc gttgaggcac cggactacgg gcacgagaca acttcggagg 840
355 cgtacagctt ctggctctgg ctcgaagcga cgtacggcgc agtgaccggc aactggacgc 900
357 cgttcaacaa cgcctggacg acgatggaaa cgtacatgat cccgcagcac gcggaccagc 960
359 cgaacaacgc gtcgtacaac cccaacagcc cggcgtcgta cgctccggaa gagccgctgc 1020
361 ccagcatgta cccggttgcc atcgacagca gcgtgccggt tgggcacgac ccgctcgccg 1080
363 ccgaattgca gtcgacgtac ggcactccgg acatttacgg catgcactgg ctggccgacg 1140
365 ttgacaacat ctacggatac ggcgacagcc ccggcggtgg ttgcgaactc ggtccttccg 1200
367 ctaagggcgt ctcctacatc aacacattcc agcgcggctc gcaggagtcc gtctgggaga 1260

369 cggtcaccca gccgacgtgc gacaacggca agtacggtgg ggcgcacggc tacgtcgacc 1320
371 tgttcatcca gggttcgacg ccgccgcagt ggaagtacac cgatgccccg gacgccgacg 1380
373 cccgtgccgt ccaggctgcg tactgggcct acacctgggc atcggcgcag ggcaaggcaa 1440
375 gcgcgattgc cccgacgatc gccaaggcga gccaaaccgg cgactacctg cggtactcgc 1500
377 tctttgacaa gtacttcaag caggtcggca actgctaccc ggccagctcc tgccctggag 1560
379 caaccggacg ccagagcgag acctacctga tcggctggta ctacgcctgg ggcggctcaa 1620
381 gccaaggctg ggcctggcgc attggtgacg gcgccgcgca cttcggctac cagaatccgc 1680
383 ttgccgcgtg ggcgatgtcg aacgtgacac cgctcattcc gctctcgccc acggcaaaga 1740
385 gcgactgggc ggcgagcttg cagcgccagc tggagttcta ccagtggttg caatccgcgg 1800
387 aaggagccat tgcgggcggc gccaccaaca gctggaacgg caattacggg accccgccgg 1860
389 ccggagactc gaccttctac ggcatggcgt acgactggga gccggtctac cacgacccgc 1920
391 cgagcaacaa ctggttcggc ttccaggcgt ggtccatgga acgggttgcc gagtactact 1980
393 acgtcaccgg cgacccgaag gccaaggcgc tgctcgacaa gtgggtcgca tgggtgaagc 2040
395 cgaatgtcac caccggtgcc tcatggtcga ttccgtcgaa tttgtcctgg agcggccaac 2100
397 cggatacctg gaatccgagc aacccaggaa cgaatgccaa cctgcacgtg accatcacgt 2160
399 cgtccgggca ggacgtcggt gttgccgcgg cgctcgcgaa gacactcgag tactacgcgg 2220
401 caaaatccgg cgatacggcc tcgcgcgacc tcgcgaaggg attgctcgac tccatgtgga 2280
403 acaacgacca ggacagcctc ggtgtgagca caccggagac gcggaccgac tactctcggt 2340
405 tcactcaggt gtacgacccg acgactggtg acggcctcta catcccgtcg ggttggacgg 2400
407 ggaccatgcc caacggtgac caaatcaagc cgggtgcgac cttcctgagc atccggtcct 2460
409 ggtacaccaa ggatccgcag tggtcgaagg tgcaggcgta cctcaacggc gggcctgctc 2520
411 cgacgttcaa ctaccaccgg ttctgggcgg agtccgactt cgcgatggcg aacgccgatt 2580
413 ttggcatgct cttcccatcc gggtcgccca gcccgacccc gagcccgact ccgacgtcgt 2640
415 ccccgagccc gactccgagc agctcgccga cgccgtcgcc cagcccgtca ccgaccggcg 2700
417 acaccacgcc gccgagcgtg ccgacgggtc ttcaggtcac cgggacaacg acgtcgtccg 2760
419 tgtcgctcag ctggaccgcg tccaccgaca acgtcggcgt cgcgcactac aacgtgtacc 2820
421 gaaacggcac gctggtgggt cagccgacag cgacgtcgtt cacggacacc ggcctggctg 2880
423 ctggcacgtc gtacacgtac acagtggcgg ccgttgatgc ggccggtaac acgtcggcgc 2940
425 agagcttcgc cggtgacagc gacgacggca tcgccgtcgc gagcccgtcg ccgagcccga 3000
427 ctccgacgtc gtccccgagc ccaacgccgt cgccgacacc gtcaccgacg tccaccagcg 3060
429 gcgcatcgtg cactgctacc tacgttgtca atagcgactg gggtagcggc ttcacgacaa 3120
431 ccgtgaccgt gacgaacacc ggcaccaggg ccaccagtgg ctggacggtc acgtggagct 3180
433 ttgccggtaa tcagacggtc accaactact ggaacaccgc gctgacgcaa tccggaaagt 3240
435 cggtgaccgc aaagaacctg agttacaaca acgtcatcca acctggtcag tcgacgacct 3300
437 ttggattcaa cggaagttac tcaggaacaa acaccgcgcc gacgctcagc tgcacggcaa 3360
439 gctga 3365

442 <210> SEQ ID NO: 3
443 <211> LENGTH: 34
444 <212> TYPE: PRT
445 <213> ORGANISM: Acidothermus cellulolyticus
448 <220> FEATURE:
449 <221> NAME/KEY: misc_feature
450 <223> OTHER INFORMATION: Potential signal peptide of Gux1
452 <400> SEQUENCE: 3
454 Met Pro Gly Leu Arg Arg Arg Leu Arg Ala Gly Ile Val Ser Ala Ala
455 1 5 10 15

458 Ala Leu Gly Ser Leu Val Ser Gly Leu Val Ala Val Ala Pro Val Ala
459 20 25 30

462 His Ala

RAW SEQUENCE LISTING ERROR SUMMARY DATE: 09/09/2005

PATENT APPLICATION: US/09/917,384B TIME: 09:01:39

Input Set : A:\Sequence listing.txt
Output Set: N:\CRF4\09092005\l917384B.raw

Invalid <213> Response:
Use of "Artificial" only as "<213> Organism" response is incomplete,
per 1.823(b) of New Sequence Rules. Valid response is Artificial Sequence.
Seq#:8